Project: Oil production and consumption: regional and per capita investigation - [Gapminder World]

Table of Contents

Introduction

Dataset Description

Gapminder World is a Swedish website created to fight misconceptions by collecting and providing reliable statistics that help people use facts to understand the world (link: here). In this report we have chosen to work with:

-Total oil production: the total amount of crude oil produced (tonnes of oil equivalent)

-Total oil consumption: the total amount of crude oil consumed (tonnes of oil equivalent)

-Oil consumption per person (tonnes per year per person)

-Oil production per person (tonnes per person)

-Countries with regions and sub-regions table (link: here)

-Income per person (GDP per capita): gross domestic product per person adjusted for differences in purchasing power (in international dollars)

Question(s) for Analysis

In this report we will analyse the data provided to answer these questions:

-What is the total oil produced and consumed in the world since the 1960s, and what is the share of each region?

-Is there a relation between oil production per capita, oil consumption per capita and GDP?

We first need to install the Plotly library to get the beautiful Filled Area Plots that we will use in this report (link: here).

Data Wrangling

All tables needed are loaded below:
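A minimal sketch of the loading step, assuming pandas. The actual file names are not given in this report, so an in-memory CSV with made-up values stands in for a real file; `sample_csv` and `df_sample` are hypothetical names.

```python
import io

import pandas as pd

# Hypothetical stand-in for a Gapminder CSV (values are made up):
# one row per country, one column per year.
sample_csv = io.StringIO(
    "country,1965,1966\n"
    "Algeria,26.5M,33.9M\n"
    "Norway,,\n"
)

# pd.read_csv accepts a file path or any file-like object.
df_sample = pd.read_csv(sample_csv)

# head() returns the first 5 rows by default.
print(df_sample.head())
```

With real data, the path to each CSV file would replace the in-memory buffer.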

We checked all CSV files in Excel to make sure the country names used in the Gapminder datasets match the country names in the regions table from Data Hub. There were some mismatches, such as "United States" and "Russia", which we fixed in Excel. Then we use head() to show the first 5 rows in Jupyter.
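The same name check could also be done in pandas instead of Excel. This is only a sketch under assumptions: the frames, the column names `country` and `name`, and the spellings in the regions table are illustrative, not taken from the real files.

```python
import pandas as pd

# Illustrative frames; names and spellings are made up for the example.
df_oil = pd.DataFrame({"country": ["United States", "Russia", "Norway"]})
df_regions = pd.DataFrame({
    "name": ["United States of America", "Russian Federation", "Norway"],
    "region": ["Americas", "Europe", "Europe"],
})

# Country names present in the oil table but absent from the regions table.
missing_in_regions = sorted(set(df_oil["country"]) - set(df_regions["name"]))
print(missing_in_regions)
```

An empty list would mean every country in the oil table can be matched to a region.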

Both the oil production and oil consumption tables begin in 1965, but the consumption table has more rows. This is expected, because more countries consume oil than produce it. Both tables contain the abbreviations K and M, for thousands and millions respectively, which we need to convert to numeric values. There are also some NaN values, mainly for recently created countries such as those of the former USSR. We need to fill these values with zeros. To make joining the region data and creating the two tables easy, we also need to swap rows and columns, but we will deal with that later.

There is no duplicated data in either oil table.
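A duplicate check of this kind can be sketched as follows. The frame below is made up and deliberately contains one repeated row, so the count is non-zero here, unlike in the real tables.

```python
import pandas as pd

# Made-up frame with one fully repeated row.
df_demo = pd.DataFrame({
    "country": ["Algeria", "Norway", "Norway"],
    "1965": ["26.5M", "0", "0"],
})

# duplicated() flags every row that repeats an earlier row in full.
n_duplicates = int(df_demo.duplicated().sum())
print(n_duplicates)
```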

The oil production and consumption per capita tables also contain NaN values and the K and M coefficients. We will only keep data between 2000 and 2019; we limit our study to 20 years to drop data from former countries that no longer exist.

We will need to rename the "name" column to "country" in order to join it to the other tables.
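The rename and the join it enables might look like the sketch below. It assumes the "name" column belongs to the regions table; the frames and the `oil_production` column are illustrative.

```python
import pandas as pd

# Illustrative frames with made-up values.
df_regions = pd.DataFrame({
    "name": ["Algeria", "Norway"],
    "region": ["Africa", "Europe"],
})
df_oil = pd.DataFrame({
    "country": ["Algeria", "Norway"],
    "oil_production": [10.0, 20.0],
})

# Align the key column name, then attach the region to each country.
df_regions = df_regions.rename(columns={"name": "country"})
df_joined = df_oil.merge(df_regions, on="country", how="left")
print(df_joined)
```

A left join keeps every row of the oil table, so a country without a matching region would show up with a missing region rather than disappear.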

Data for GDP begin in the year 1800 and end in 2050. For our work we only need data from 2000 to 2019, so we first keep only the data from 1800 to 2019 and then remove the data from 1800 to 1999.
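One way to restrict a wide table to the study window is sketched here in a single step rather than the two steps described above. It assumes the year columns are named with year strings; `df_gdp` and its values are made up.

```python
import pandas as pd

# Illustrative wide table: one column per year, named as strings.
df_gdp = pd.DataFrame({
    "country": ["Norway"],
    "1999": [1], "2000": [2], "2019": [3], "2020": [4],
})

# Keep the country column plus the year columns inside 2000..2019.
year_cols = [
    c for c in df_gdp.columns
    if c.isdigit() and 2000 <= int(c) <= 2019
]
df_gdp_study = df_gdp[["country"] + year_cols]
print(df_gdp_study)
```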

We now need to reduce df_opc and df_occ to the years of study only, meaning 2000 to 2019.

We need to change the structure of the oil production, oil consumption and income per capita data frames using the melt function. Melting is the process of turning columns of our data into rows.
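A small sketch of melting, assuming a wide table with one column per year. The frame and the output column names `year` and `oil_production` are illustrative choices.

```python
import pandas as pd

# Illustrative wide table with made-up values.
df_wide = pd.DataFrame({
    "country": ["Algeria", "Norway"],
    "2000": [1.0, 2.0],
    "2001": [3.0, 4.0],
})

# Each year column becomes rows: one row per (country, year) pair.
df_long = df_wide.melt(
    id_vars="country",
    var_name="year",
    value_name="oil_production",
)
print(df_long)
```

Two countries and two year columns give four rows in the long table, which is the shape that joins and grouped plots work with most easily.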

### Data Cleaning

We will create three tables, including one containing oil production and oil consumption joined with region, and another containing oil production and consumption per capita joined with region.

We first need to fill the remaining null values in the overall oil production and consumption tables with zeros. This makes sense because a null value means that a country has not discovered oil yet or that its resources have dried up.

Next we will replace K and M by their values, using the replace function.
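The conversion can be sketched as below. Note that this sketch uses a small parsing helper rather than the replace function mentioned above; `to_number` and `MULTIPLIERS` are hypothetical names, and accepting a lowercase "k" is an assumption about how the files may spell the suffix.

```python
import pandas as pd

# Multipliers for the suffixes; lowercase "k" is accepted as an assumption.
MULTIPLIERS = {"K": 1e3, "k": 1e3, "M": 1e6}


def to_number(value):
    """Convert text such as '26.5M' or '3.5K' to a float; pass numbers through."""
    if isinstance(value, str):
        text = value.strip()
        if text and text[-1] in MULTIPLIERS:
            return float(text[:-1]) * MULTIPLIERS[text[-1]]
        return float(text)
    return float(value)


# Made-up values in the style described above.
raw = pd.Series(["26.5M", "3.5K", "120"])
converted = raw.map(to_number)
print(converted.tolist())
```

Parsing the suffix explicitly avoids evaluating strings as expressions and raises an error on unexpected text instead of silently producing a wrong number.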

The year column has an object type; we need to convert it to integer.
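A minimal sketch of such a type conversion, assuming the column in question is the year column, which is left as text after melting. The frame is illustrative.

```python
import pandas as pd

# Illustrative long table where the years are still text.
df_long = pd.DataFrame({
    "country": ["Norway", "Norway"],
    "year": ["2000", "2001"],
    "value": [1.0, 2.0],
})

# Convert the year labels from text to integers.
df_long["year"] = df_long["year"].astype(int)
print(df_long.dtypes)
```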

Exploratory Data Analysis

In this part we will create two Filled Area Plots showing oil production and oil consumption per region, using the Plotly library (link: here).

Next we will use the last table to investigate whether there is a correlation between GDP and oil.

Research Question 1: What is the total oil produced and consumed in the world since the 1960s, and what is the share of each region?

We can see an overall increase in production until around 1979, followed by a crash as prices collapsed, an episode known as the "1980s oil glut". Around 2010, production increases again due to technical innovation and the advent of shale oil and gas extraction.

We clearly see the same pattern. Let us now see whether this is a general trend and not local to those countries.

Here we will use Plotly to get beautiful Filled Area Plots.
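Before plotting, the data has to be aggregated per region and year. The sketch below shows one way to do that and to compute each region's share of the yearly total; the frame, its values and the column names are made up for illustration.

```python
import pandas as pd

# Illustrative long table with made-up values.
df_long = pd.DataFrame({
    "country": ["Algeria", "Nigeria", "Norway", "Algeria", "Nigeria", "Norway"],
    "region": ["Africa", "Africa", "Europe", "Africa", "Africa", "Europe"],
    "year": [2000, 2000, 2000, 2001, 2001, 2001],
    "oil_production": [4.0, 6.0, 30.0, 10.0, 20.0, 10.0],
})

# Total production per region and year.
regional = df_long.groupby(["region", "year"], as_index=False)["oil_production"].sum()

# Share of each region in the world total of the same year.
world_total = regional.groupby("year")["oil_production"].transform("sum")
regional["share"] = regional["oil_production"] / world_total
print(regional)
```

A long table like `regional`, with one row per region and year, is the shape that Plotly Express's `px.area` takes through its `x`, `y` and `color` arguments.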

We see the same pattern but with a gentler slope.

Research Question 2: Is there a relation between oil production per capita, oil consumption per capita and GDP?

We can see a relative decline in oil production and consumption per capita but a rise in GDP.

The distribution of GDP per capita is right-skewed, with a tendency towards a slightly more even distribution in 2019 than in 2000.
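Right skew can be checked numerically as well as visually. This sketch uses made-up values with one very large observation; a positive skewness and a mean above the median both point to a right-skewed distribution.

```python
import pandas as pd

# Made-up GDP per capita values with a long right tail.
gdp = pd.Series([1_000, 2_000, 3_000, 4_000, 50_000])

# Sample skewness; positive values indicate a right-skewed distribution.
skewness = gdp.skew()
print(skewness, gdp.mean(), gdp.median())
```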

Oil production per capita and oil consumption per capita both show a strong positive correlation with GDP per capita, with Pearson correlation coefficients of R = 0.77 for production and R = 0.81 for consumption.
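A Pearson coefficient of this kind can be computed with pandas, as in the sketch below. The values and column names are made up and do not reproduce the coefficients reported above.

```python
import pandas as pd

# Made-up values for illustration only.
df_corr = pd.DataFrame({
    "gdp_per_capita": [1.0, 2.0, 3.0, 4.0],
    "oil_consumption_per_capita": [2.0, 4.0, 6.0, 8.5],
})

# Series.corr uses the Pearson method by default.
r = df_corr["gdp_per_capita"].corr(df_corr["oil_consumption_per_capita"])
print(round(r, 3))
```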

Conclusions

  • A general increasing trend in overall oil production and consumption since the 1960s, interrupted around 1979 by a crash as prices collapsed, an episode known as the "1980s oil glut". Even as consumption grew, production followed as new technologies such as shale oil extraction emerged.

  • A strong correlation of GDP per capita with oil production per capita and with oil consumption per capita. This suggests that oil was a main driver of economic growth.

Limitations

  • The oil production data does not specify the origin of the oil (onshore fields, offshore condensate or shale oil), so we cannot investigate this further.

  • Gas production and consumption data were not available; they would have offered a much broader picture of fossil energy.

  • Even if there is a clear relationship between GDP per capita and oil production and consumption per capita, other forms of energy such as coal, wind and solar would have drawn a broader picture and given us an idea of the energy transition.

  • Besides GDP, we may need other metrics to measure wellbeing and livability, for example the Human Development Index.